Skip to content

fix(python): add backward compatibility for SandboxClaim status.sandbox.Name#515

Merged
k8s-ci-robot merged 1 commit intokubernetes-sigs:mainfrom
kincoy:fix/sandboxclaim-name-case-compatibility
Apr 6, 2026
Merged

fix(python): add backward compatibility for SandboxClaim status.sandbox.Name#515
k8s-ci-robot merged 1 commit intokubernetes-sigs:mainfrom
kincoy:fix/sandboxclaim-name-case-compatibility

Conversation

@kincoy
Copy link
Copy Markdown
Contributor

@kincoy kincoy commented Apr 3, 2026

Adds backward compatibility for status.sandbox.name in the Python client.

PR #440 renamed the field from Name to name but the change isn't released yet. Older controllers still return Name, breaking things for users (#505).

The code review bot in #440 actually recommended adding a fallback to check both fields, but it didn't make it into the final PR. This adds that fallback - it's just one line and lets the Python client work with any controller version.

Fixes #505

@netlify
Copy link
Copy Markdown

netlify bot commented Apr 3, 2026

Deploy Preview for agent-sandbox canceled.

Name Link
🔨 Latest commit 3d7bd64
🔍 Latest deploy log https://app.netlify.com/projects/agent-sandbox/deploys/69cf872773398b0008d31013

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 3, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Welcome @kincoy!

It looks like this is your first PR to kubernetes-sigs/agent-sandbox 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/agent-sandbox has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 3, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Hi @kincoy. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Apr 3, 2026
@kincoy kincoy force-pushed the fix/sandboxclaim-name-case-compatibility branch from bd3b4e2 to 3d7bd64 Compare April 3, 2026 09:23
@SHRUTI6991
Copy link
Copy Markdown
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 6, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kincoy, SHRUTI6991

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 6, 2026
@k8s-ci-robot k8s-ci-robot merged commit 8112b33 into kubernetes-sigs:main Apr 6, 2026
9 checks passed
johannhartmann added a commit to mayflower/agent-sandbox that referenced this pull request Apr 8, 2026
Running the vitest integration suite against a real kind cluster
(LANGCHAIN_SANDBOX_TEMPLATE=df-standard, LANGCHAIN_NAMESPACE=darkfactory,
KUBECONFIG=bin/KUBECONFIG) surfaced three bugs inherited from the
original source. All 97 standard tests now pass after these fixes.

1. K8sClient: loadFromCluster silently broken outside a pod
   `@kubernetes/client-node`'s loadFromCluster() does not throw when
   called outside a pod; it returns a config whose cluster server URL
   is `https://undefined:undefined`, which then fails at request time
   with "Invalid URL". Detect the in-pod environment explicitly via the
   KUBERNETES_SERVICE_HOST env var (set by kubelet for every pod) and
   only call loadFromCluster() there. Outside a pod, use
   loadFromDefault() which properly honors $KUBECONFIG.

2. K8sClient: status.sandbox.name vs Name backcompat
   Older controllers serialize the resolved sandbox name as `Name`
   (capital N) because the Go struct tag was omitted. Newer controllers
   use `name`. The TS client only checked `.name`, so resolveSandboxName
   timed out against the 11-day-old controller in the test cluster.
   Read both field names — mirrors the Python SDK's own backcompat fix
   in kubernetes-sigs#515.

3. K8sAgentSandbox.create: ignored initialFiles option
   The deepagents-js shared sandbox standard test suite passes
   initialFiles: Record<string, string> to the createSandbox callback
   to seed files before the agent runs. K8sAgentSandbox.create() took
   the options and dropped initialFiles on the floor, causing 6 tests
   in the initialFiles describe block to fail. Add the field to
   K8sAgentSandboxCreateOptions and upload after initialize(), aborting
   (and deleting the claim) on any upload failure.

Test results:

- 86 unit tests pass (unchanged)
- 97 vitest int tests pass against the real kind cluster:
    - sandbox lifecycle (3)
    - command execution (6)
    - file operations (6)
    - write (10)
    - read (12)
    - edit (12)
    - ls (3)
    - grep (11)
    - glob (11)
    - initialFiles (7) — all now passing after fix kubernetes-sigs#3
    - integration workflows (2)
    - K8sAgentSandbox provider-specific (5)
johannhartmann added a commit to mayflower/agent-sandbox that referenced this pull request Apr 8, 2026
Ports the K8sAgentSandbox provider from
langchain-ai/deepagentsjs (feat/k8s-agent-sandbox branch) into this
repo as clients/typescript/langchain-agent-sandbox, mirroring the
Python clients/python/langchain-agent-sandbox package. Upgraded to
deepagents >=1.9 (the new V2 BackendProtocol) and integrated into
the repo's kind e2e suite. npm package name: langchain-agent-sandbox
(unscoped, matches the Python sibling).

## Package contents

clients/typescript/langchain-agent-sandbox/:
- src/sandbox.ts      K8sAgentSandbox extends BaseSandbox with
                      execute / uploadFiles / downloadFiles / id +
                      static factories fromUrl / create / fromExisting /
                      deleteAll. File operations (ls / read / readRaw /
                      grep / glob / write / edit) are inherited from
                      the V2 BaseSandbox defaults that delegate to
                      execute() with POSIX shell.
- src/http-client.ts  SandboxRouterClient — HTTP transport layer that
                      talks to the sandbox-router-svc with
                      X-Sandbox-ID / X-Sandbox-Namespace / X-Sandbox-Port
                      headers. Same wire protocol as the Python and
                      Go SDKs.
- src/k8s-client.ts   K8sClient — thin wrapper around
                      @kubernetes/client-node for SandboxClaim CRUD,
                      resolveSandboxName, waitForSandboxReady, Gateway
                      IP discovery, and label-selector-filtered listing.
- src/connection.ts   Three connection strategies: DirectConnectionStrategy,
                      GatewayConnectionStrategy, TunnelConnectionStrategy
                      (kubectl port-forward subprocess).
- src/types.ts        Connection configs, K8sAgentSandboxOptions,
                      K8sAgentSandboxCreateOptions (including
                      initialFiles), K8sAgentSandboxError class with
                      typed code union including COMMAND_TIMEOUT.
- src/index.ts        Public exports.
- src/*.test.ts       4 unit test files, 102 tests, fully mocked.
- src/sandbox.int.test.ts  Integration suite wired to
                           @langchain/sandbox-standard-tests (97
                           behavior tests covering sandbox lifecycle,
                           command execution, file operations, write,
                           read, edit, ls, grep, glob, initialFiles,
                           and integration workflows).
- package.json, tsconfig.json, tsdown.config.ts, vitest.config.ts,
  .gitignore, .npmignore, README.md, pnpm-lock.yaml.

examples/langchain-deepagentsjs/: minimal end-to-end example with
main.ts (Anthropic + deepagents + K8sAgentSandbox), package.json,
README.md, sandbox-template.yaml, .gitignore.

## Repo integration

- Makefile: new `test-langchainjs` target runs
  `pnpm install --frozen-lockfile && pnpm typecheck && pnpm test`.
- .github/workflows/test-langchainjs.yml: PR-time unit test CI on
  Node 20 + 22 matrix when clients/typescript/langchain-agent-sandbox
  changes. Matches the existing minimal-permissions pattern.
- dev/tools/test-e2e: setup_typescript_sdk installs the package if
  pnpm is present (hard fails on `pnpm install --frozen-lockfile`
  failure to prevent stale-node_modules silent passes);
  run_typescript_e2e_tests runs vitest --mode int with
  KUBECONFIG=bin/KUBECONFIG and outputs bin/e2e-typescript-junit.xml.
  Both functions short-circuit cleanly when pnpm, the package dir, or
  LANGCHAIN_SANDBOX_TEMPLATE is missing, so they never block the Go
  and Python suites from running.
- dev/ci/presubmits/test-e2e: copies the new junit artifact into
  $ARTIFACTS when present.
- test/e2e/README.md: documents the shared sandbox-router dependency
  for all three SDKs (Python, Go, TS) as a first-class deployment
  prerequisite, with the full build + IMAGE_PLACEHOLDER substitution
  + kubectl apply workflow.
- Root README.md: TypeScript SDK section after the Python SDK
  section.

## deepagents-js V2 protocol

The existing upstream code extended BaseSandbox (V1) and implemented
only the four abstract members (execute, uploadFiles, downloadFiles,
id). Against deepagents@^1.9.0 those four members satisfy the V2
SandboxBackendProtocolV2 contract unchanged, and the file operations
inherit from the V2 BaseSandbox defaults. No method rewrites were
required — only targeted hardening (see below).

## Post-port hardening

Eight bugs / gaps surfaced during the migration were fixed in this
same commit:

1. execute() COMMAND_TIMEOUT detection: the original port wrapped
   every fetch() rejection in http-client.ts as CONNECTION_FAILED,
   including the DOMException raised by AbortSignal.timeout. The new
   http-client.ts lets errors named TimeoutError / AbortError
   propagate without wrapping so execute()'s catch block can
   distinguish them and emit a typed COMMAND_TIMEOUT error.

2. K8sClient constructor: @kubernetes/client-node's loadFromCluster()
   does not throw outside a pod — it silently creates a config with
   server URL https://undefined:undefined, which crashes at request
   time. Fixed by checking process.env.KUBERNETES_SERVICE_HOST to
   select loadFromCluster() vs loadFromDefault() explicitly. The
   in-pod path still works unchanged.

3. resolveSandboxName: older controllers (pre upstream PR kubernetes-sigs#440)
   serialize the field as status.sandbox.Name (capital N) because
   the Go JSON tag was explicitly `json:"Name,omitempty"`. PR kubernetes-sigs#440
   renamed it to `json:"name,omitempty"`. The extract callback now
   reads both field names with lowercase precedence, mirroring the
   Python SDK's backcompat shim in upstream PR kubernetes-sigs#515.

4. create() initialFiles: the deepagents-js shared standard test
   suite passes `initialFiles: Record<string, string | Uint8Array>`
   in createSandbox options to pre-populate files. Added the field
   to K8sAgentSandboxCreateOptions and upload-after-initialize
   handling with full cleanup on failure — on any upload error,
   sandbox.close() is called first to tear down the tunnel subprocess
   (preventing a `kubectl port-forward` leak), then the claim is
   deleted.

5. listSandboxClaims + deleteAll label filtering: the original port
   accepted a labels argument but silently ignored it, deleting
   every claim in the namespace. listSandboxClaims now accepts an
   optional labels record that is validated (RFC-1123 Kubernetes
   label regex) and serialized into a labelSelector. Values
   containing "," or "=" are rejected at serialization time so a
   label like {owner: "alice,env=prod"} can't silently widen into a
   two-predicate selector. deleteAll also refuses an empty labels
   object unless the caller passes { confirmDeleteAll: true },
   eliminating the original "delete everything" footgun by default.

6. execute() timeout distinction: new COMMAND_TIMEOUT error code
   lets callers distinguish a per-command timeout from a generic
   CONNECTION_FAILED / COMMAND_FAILED. Verified end-to-end against
   a real kind cluster — `sleep 30` with defaultTimeout=3 terminates
   at exactly 3005ms with the right typed error.

7. int test env-var gating: the LANGCHAIN_SANDBOX_TEMPLATE /
   LANGCHAIN_NAMESPACE env var names match the Python langchain-
   agent-sandbox e2e test for symmetry, with the legacy
   SANDBOX_TEMPLATE / SANDBOX_NAMESPACE names kept as fallbacks.
   sandboxStandardTests({ skip: !template, ... }) uses the shared
   suite's built-in skip config so the run stays clean when the
   template is unset.

8. docs: the sandbox-router Deployment manifest at
   clients/python/agentic-sandbox-client/sandbox-router/sandbox_router.yaml
   has a literal `image: IMAGE_PLACEHOLDER` line that requires
   build + sed substitution before applying — applying it raw
   produces a CrashLoopBackOff. Both READMEs now document the full
   build + substitute + apply workflow and link to the router's own
   README for configuration details.

## Verification

- 102 unit tests pass (`pnpm test`)
- 97 integration tests pass against a real kind cluster
  (`LANGCHAIN_SANDBOX_TEMPLATE=df-standard
    LANGCHAIN_NAMESPACE=darkfactory pnpm test:int`) — covers the
  full deepagents-js shared standard suite (sandbox lifecycle,
  command execution, file operations, write, read, edit, ls, grep,
  glob, initialFiles, integration workflows) plus provider-specific
  lifecycle tests.
- pnpm typecheck clean
- pnpm build produces dual ESM + CJS output
- make test-langchainjs clean
- All 14 .ts files carry the Apache-2.0 / Kubernetes Authors header
- No `any` introduced; #private fields used throughout

## Source attribution

Ported from langchain-ai/deepagentsjs feat/k8s-agent-sandbox branch
under Apache-2.0; the upstream source was MIT-licensed. All files
carry the kubernetes-sigs Apache-2.0 header.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area:rich-client cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] resolve_sandbox_name fails with Alibaba ACK due to case mismatch in sandbox status field

4 participants